In [5]:
# TensorBoard Helper Functions and Constants
# Directory to export TensorBoard summary statistics, graph data, etc.
TB_DIR = '/tmp/tensorboard/tf_basics'
def _start_tb(d):
    """
    Private function that calls the `tensorboard` shell command

    args:
        d: The desired directory to launch in TensorBoard
    """
    !tensorboard --port=6006 --logdir=$d

def start_tensorboard(d=TB_DIR):
    """
    Starts TensorBoard from the notebook in a separate thread.
    Prevents Jupyter Notebook from halting while TensorBoard runs.
    """
    import threading
    threading.Thread(target=_start_tb, args=(d,)).start()
    del threading

def stop_tensorboard():
    """
    Kills all TensorBoard processes
    """
    !ps -aef | grep "tensorboard" | tr -s ' ' | cut -d ' ' -f2 | xargs kill -KILL

def reset_tensorboard():
    stop_tensorboard()
    start_tensorboard()
In [6]:
# Import core TensorFlow libraries
import tensorflow as tf
import numpy as np
In [7]:
# `tf.placeholder` creates an "input" node; we will give it a value when we run our model
a = tf.placeholder(tf.int32, name="input_a")
b = tf.placeholder(tf.int32, name="input_b")
In [8]:
# `tf.add` creates an addition node
c = tf.add(a, b, name="add")
# `tf.mul` creates a multiplication node
d = tf.mul(a, b, name="multiply")
In [9]:
# Add up the results of the previous two nodes
out = tf.add(c, d, name="output")
In [10]:
# OPTIONAL
# Create a scalar summary, which will log the value we tell it to when executed
# In this case, we'll tell it to save our output value from `out`
# This works in tandem with our SummaryWriter below
# To create the summary, we pass in two parameters:
# 1. A 'tag', which gives a label to the data
# 2. The value(s) we'd like to save
# We also give a `name` to the summary itself (does not affect behavior)
out_summary = tf.scalar_summary("output", out, name="output_summary")
In [11]:
# Start a session
sess = tf.Session()
In [12]:
# Create a "feed_dict" dictionary to define input values
# Keys to dictionary are handles to our placeholders
# Values to dictionary are values we'd like to feed in
feed_dict = { a: 4, b: 3 }
In [13]:
# OPTIONAL
# Opens a `SummaryWriter` object, which can write stats about the graph to disk
# We pass in two parameters into the SummaryWriter constructor
# The first is a string, specifies a directory to write to.
# (Note: `TB_DIR` was specified earlier. "TB" stands for TensorBoard)
# The second parameter passes in our graph. This allows us to visualize our graph later
writer = tf.train.SummaryWriter(TB_DIR, graph=sess.graph)
In [14]:
# Execute the graph using `sess.run()`, passing in two parameters:
# The first parameter, `fetches` lists which node(s) we'd like to receive as output
# The second parameter, `feed_dict`, feeds in key-value pairs
# to input or override the value of nodes
# In this case, we run both the output value, as well as its scalar summary
result, summary = sess.run([out, out_summary], feed_dict=feed_dict)
# Print output with fun formatting
print("(({0}*{1}) + ({0}+{1})) = ".format(feed_dict[a], feed_dict[b]) + str(result))
In [15]:
# We add the summary to our SummaryWriter, which will write it to disk:
# Normally, these summaries are used to generate statistics over time
# TensorBoard doesn't do well visualizing single points, so we fake a "global_step"
# With two points, it will generate a line
writer.add_summary(summary, global_step=0)
writer.add_summary(summary, global_step=100)
In [16]:
# Use SummaryWriter.flush() to write all previously added summaries to disk
# This will also flush the list of summaries so that none are added twice
writer.flush()
In [17]:
# We're done! Close down our Session and SummaryWriter to tidy up.
# Note that SummaryWriter.close() automatically calls flush(), so any summaries left will be written to disk
sess.close()
writer.close()
In [ ]:
# Start TensorBoard
start_tensorboard()
Go to your server's IP at port 6006 (replace 1.2.3.4 with your server's IP): http://1.2.3.4:6006
Note that start_tensorboard() is a convenience function defined above. Normally, one would start TensorBoard in a terminal with a command like this (assuming TensorFlow was installed with pip):
$ tensorboard --logdir=/path/to/SummaryWriter/dir
Explore TensorBoard!
In [18]:
# Once you are done, stop TensorBoard
stop_tensorboard()
Here's the main code all together without as many comments in the way:
# Define inputs
a = tf.placeholder(tf.int32, name="input_a")
b = tf.placeholder(tf.int32, name="input_b")
# First "layer" of transformations
c = tf.add(a, b, name="add")
d = tf.mul(a, b, name="multiply")
# Output node and associated summary
out = tf.add(c, d, name="output")
out_summary = tf.scalar_summary("output", out, name="output_summary")
# Start a session
sess = tf.Session()
# Define our "input" dictionary
feed_dict = { a: 4, b: 3 }
# Open a SummaryWriter
writer = tf.train.SummaryWriter(TB_DIR, graph=sess.graph)
# Compute the values of our output node and its summary
result, summary = sess.run([out, out_summary], feed_dict=feed_dict)
# Write summary to disk
writer.add_summary(summary, global_step=0)
writer.add_summary(summary, global_step=100)
# Close out of session and writer objects
sess.close()
writer.close()
Tensors, simply put, are n-dimensional arrays. A 0-dimensional tensor is a single number (or scalar), a 1-dimensional tensor is a vector, and a 2-dimensional tensor is a standard matrix. Higher-dimensional tensors are simply referred to as "n-D tensors".
Every value that is passed through a TensorFlow model is a Tensor object, the TensorFlow representation of a tensor.
In [ ]:
# 0-D tensor (scalar)
t_0d_py = 4
# 1-D tensor (vector)
t_1d_py = [1, 2, 3]
# 2-D tensor (matrix)
t_2d_py = [[1, 2],
           [3, 4],
           [5, 6]]
# 3-D tensor
t_3d_py = [[[0, 0], [0, 1], [0, 2]],
           [[1, 0], [1, 1], [1, 2]],
           [[2, 0], [2, 1], [2, 2]]]
In [ ]:
# 0-D tensor (scalar)
t_0d_np = np.array(4, dtype=np.int32)
# 1-D tensor (vector)
t_1d_np = np.array([1, 2, 3], dtype=np.int64)
# 2-D tensor (matrix)
t_2d_np = np.array([[1, 2],
                    [3, 4],
                    [5, 6]],
                   dtype=np.float32)
# 3-D tensor
t_3d_np = np.array([[[0, 0], [0, 1], [0, 2]],
                    [[1, 0], [1, 1], [1, 2]],
                    [[2, 0], [2, 1], [2, 2]]],
                   dtype=np.int32)
In general, using np.array (or np.asarray) is the recommended way of defining values for tensors by hand in TensorFlow. The primary reason for this is that you can specify the exact data type ("dtype") you'd like the values to be represented with. For example, there's no way to specify a 32-bit integer vs. a 64-bit integer with native Python. TensorFlow is tightly integrated with NumPy, and most TensorFlow data types have a corresponding NumPy dtype:
TensorFlow type | Equivalent NumPy type | Description
---|---|---
tf.float32 | np.float32 | 32-bit floating point.
tf.float64 | np.float64 | 64-bit floating point.
tf.int8 | np.int8 | 8-bit signed integer.
tf.int16 | np.int16 | 16-bit signed integer.
tf.int32 | np.int32 | 32-bit signed integer.
tf.int64 | np.int64 | 64-bit signed integer.
tf.uint8 | np.uint8 | 8-bit unsigned integer.
tf.string | N/A | String type, as byte array.
tf.bool | np.bool | Boolean.
tf.complex64 | np.complex64 | Complex number made of two 32-bit floating point numbers: real and imaginary parts.
tf.qint8 | N/A | 8-bit signed integer used in quantized Ops.
tf.qint32 | N/A | 32-bit signed integer used in quantized Ops.
tf.quint8 | N/A | 8-bit unsigned integer used in quantized Ops.
Slightly modified version of this table
In [19]:
# Just to show that they are equivalent
(tf.float32 == np.float32 and
tf.float64 == np.float64 and
tf.int8 == np.int8 and
tf.int16 == np.int16 and
tf.int32 == np.int32 and
tf.int64 == np.int64 and
tf.uint8 == np.uint8 and
tf.bool == np.bool and
tf.complex64 == np.complex64)
Out[19]:
The primary exception to using np.array() is when defining a Tensor of strings. When using strings, just use standard Python lists. It's best practice to include the b prefix in front of strings to explicitly define them as byte arrays:
In [ ]:
tf_string_tensor = [b"first", b"second", b"third"]
A common term in TensorFlow is a Tensor object's "shape". A shape value is a list or tuple containing an ordered set of integers. The i-th element in the list describes the length of the i-th dimension in the tensor, while the number of elements in the list defines the dimensionality of the tensor. Here are some examples:
In [ ]:
# Shapes corresponding to scalars
# Note that either lists or tuples can be used
s_0d_list = []
s_0d_tuple = ()
# Shape corresponding to a vector of length 3
s_1d = [3]
# Shape corresponding to a 2-by-3 matrix
s_2d = (2, 3)
# Shape corresponding to a 4-by-4-by-4 cube tensor
s_3d = [4, 4, 4]
# Shape with an unspecified first dimension (any length is accepted there)
s_var = [None, 4, 4]
You can use the tf.shape Operation to get the shape value of Tensor objects:
In [20]:
with tf.Session() as sess:
    get_shape = tf.shape([[[1, 2, 3], [1, 2, 3]],
                          [[2, 4, 6], [2, 4, 6]],
                          [[3, 6, 9], [3, 6, 9]],
                          [[4, 8, 12], [4, 8, 12]]])
    shape = sess.run(get_shape)
    print("Shape of tensor: " + str(shape))
In [21]:
my_const = tf.constant(np.array([1, 2, 3], dtype=np.float32))
If a set of values is going to be reused throughout your graph, using a constant is an easy way to place that value directly into the graph (instead of reading it from a NumPy array or Python list each time).
Note: all Tensor objects are immutable. The constant type is simply a convenient way to add basic Tensor values to a graph.
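For instance, here is a minimal sketch (the names double_const and shift_const are purely illustrative) that reuses my_const from the cell above in two different Operations:
# Illustrative sketch: reuse `my_const` (a float32 constant) in two Operations
double_const = tf.mul(my_const, 2.0, name="double_my_const")
shift_const = tf.add(my_const, 1.0, name="shift_my_const")
with tf.Session() as const_sess:
    # Should print roughly [2. 4. 6.] and [2. 3. 4.]
    print(const_sess.run([double_const, shift_const]))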
SparseTensor
TensorFlow has implementations of sparse tensor representations, or tensors whose entries primarily consist of zeros. In some instances, SparseTensor and Tensor objects can be intermixed, but more often than not they require more care. Because the SparseTensor API isn't as robust as the Tensor API, and for the sake of keeping things digestible, we won't cover SparseTensor objects today.
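For the curious, here is a purely illustrative sketch of what constructing one looks like (the arguments are the indices of the nonzero entries, their values, and the overall dense shape); we won't use it again:
# Illustrative only: a 3x4 tensor whose only nonzero entries are
# 3 at position (0, 0) and 5 at position (2, 1)
sparse = tf.SparseTensor([[0, 0], [2, 1]], [3, 5], [3, 4])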
TensorFlow Operation objects (also referred to as "Ops" in the TensorFlow documentation; we will avoid that usage today to avoid mixing DevOps and TensorFlow Ops) are nodes that perform computation on or with Tensor objects. They take as input zero or more Tensor objects (or objects that can be converted into tensors; see the previous section), and output zero or more tensors. These outputs can then either be returned to the client or passed on to further Operations. Operations are the fundamental building blocks of any TensorFlow graph: their calculations represent nodes, and data flowing from one to the next represents edges.
We've already seen several Operations earlier: tf.add and tf.mul are classic examples. They both take in two tensors and output one. When given non-scalar values, they do addition/multiplication element-wise.
In [22]:
# Initialize some tensors
a = np.array([1, 2], dtype=np.int32)
b = np.array([3, 4], dtype=np.int32)
# `tf.add()` creates an "add" Operation and places it in the graph
# The variable `c` will be a handle to the output of the operation
# This output can be passed on to other Operations!
c = tf.add(a, b)
The important thing to remember is that Operations do not execute when created; that's the reason tf.add([1, 2], [3, 4]) doesn't return the value [4, 6] immediately. It must be passed into a Session.run() call, which we'll cover in more detail below.
In [23]:
sess = tf.Session()
print(sess.run(c))
c_result = sess.run(c)
The majority of the TensorFlow API consists of Operations. tf.scalar_summary and tf.placeholder were both Operations we used in the first example; remember that we had to run the out_summary variable in Session.run().
In addition to Operation-specific inputs, each Operation can take in a name parameter, which can help identify Operations in TensorBoard and other tools.
In [ ]:
c = tf.add(a, b, name="my_add_operation")
Getting into the habit of adding names to your Operations now will save you headaches later on.
When TensorFlow is imported into Python, it automatically creates a Graph
object and makes it the default graph. You can create more graphs as well:
In [24]:
# Create a new graph - constructor takes no parameters
new_graph = tf.Graph()
However, operations (such as tf.add and tf.mul) are added to the default graph when created. To add operations to your new graph, use a with statement along with the graph's as_default() method. This makes that graph the default while inside of the with block:
In [25]:
# DEFAULT GRAPH
co = tf.constant(4)

with new_graph.as_default():
    a = tf.add(3, 4)
    b = tf.mul(a, 2)
    other_co = tf.constant(6)
In [31]:
sess = tf.Session(graph=tf.get_default_graph())
sess.run(co)
Out[31]:
The default graph, other than being set to the default, is no different than any other Graph. If you need a handle to the default graph, use the tf.get_default_graph function:
In [26]:
default_graph = tf.get_default_graph()
Note: get_default_graph() will return whatever graph is currently set as the default, so if you are inside of a with g.as_default() block, get_default_graph() will return g:
In [33]:
with new_graph.as_default():
    print(new_graph is tf.get_default_graph())

print(new_graph is tf.get_default_graph())
Most TensorFlow models will not require more than one graph per script. However, you may find multiple graphs useful when defining two independent models side-by-side. Additionally, there are mechanisms to export and import external models and load them in as Graph objects, which can allow you to feed the output of existing models into your new model (or vice versa). We won't be able to demonstrate these now, but see Graph.as_graph_def() and tf.import_graph_def in the TensorFlow API for more information.
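As a rough sketch of what that workflow looks like (we don't need it in this notebook), you could serialize one graph's definition and import it into a fresh Graph:
# Sketch only: serialize the default graph's definition...
graph_def = tf.get_default_graph().as_graph_def()

# ...and import it into a brand new Graph, namespaced under "imported"
another_graph = tf.Graph()
with another_graph.as_default():
    tf.import_graph_def(graph_def, name="imported")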
As we saw earlier, Session objects are used to launch and execute graphs. Earlier, we created a session using its default constructor, but it has three optional parameters:

- target specifies the execution engine to use. By default it is the empty string, which causes the Session to use the standard local execution context. Typically, this parameter is only used when running TensorFlow in a distributed setting.
- graph specifies which Graph object the session should run. The default value is None, which causes the Session to load in the default graph. Sessions only manage one graph at a time, so executing more than one graph will require more than one session.
- config allows users to specify advanced options to configure the session. We won't cover this today, but some of the available options are: limiting the number of CPUs/GPUs used, logging options, and changing the optimization of the graph (see the sketch after this list for one example).
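As one small, hedged example of the config parameter (we don't use it elsewhere in this notebook), a tf.ConfigProto can ask TensorFlow to log which device each Operation is placed on:
# Illustrative only: log device placement as Operations are assigned to devices
logging_sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
logging_sess.close()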
In [34]:
# A session with the default graph launched
# Equivalent to `tf.Session(graph=tf.get_default_graph())`
sess_default = tf.Session()
# A session with new_graph launched
sess_new = tf.Session(graph=new_graph)
The most important method of a Session is its run() function. Earlier in this notebook, we saw basic usage of the two primary parameters to run(): fetches and feed_dict.

fetches
fetches expects a list of Tensor and/or Operation handles (or just a single Tensor/Operation). The list specifies what computations we would like TensorFlow to run, as well as what we'd like run() to output:
In [35]:
sess_default.run(tf.add(3,2))
Out[35]:
TensorFlow will only perform the calculations necessary to compute the values specified in fetches, so it won't waste time if you only need to run a small part of a large, complicated graph.
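As a small illustration (the nodes x, y, and z below are made up for this sketch), fetching only part of a graph means the other branches are never evaluated:
# Sketch: `z` is in the graph, but since it isn't fetched (directly or
# indirectly), TensorFlow never computes it in this run() call
x = tf.constant(2, name="sketch_x")
y = tf.mul(x, 3, name="sketch_y")
z = tf.mul(x, 1000000, name="sketch_z")
with tf.Session() as sketch_sess:
    print(sketch_sess.run([x, y]))  # [2, 6]; the `z` branch is skipped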
feed_dict
feed_dict is an optional parameter to run, but it becomes required when placeholder nodes are included. We saw it used to feed input data to placeholders, but feed_dict can actually send values to any node. The keys of the dictionary should be handles to Tensor objects (usually outputs of Operations), and the values should be the replacement data:
In [ ]:
# Create Operations, Tensors, etc (using the default graph)
a = tf.add(3, 4)
b = tf.mul(a, 5)
# Define a dictionary that says to replace the value of `a` with 15
replace_dict = {a: 15}
In [ ]:
# Run the session without feed_dict
# Prints (3 + 4) * 5 = 35
print(sess_default.run(b))
In [ ]:
# Run the session, passing in `replace_dict` as the value to `feed_dict`
# Prints 15 * 5 = 75 instead of 7 * 5 = 35
print(sess_default.run(b, feed_dict=replace_dict))
When using placeholders, TensorFlow insists that any calls to Session.run() include feed_dict values for all placeholders:
In [36]:
a = tf.placeholder(tf.int32, name="my_placeholder")
b = tf.add(a, 3)
In [38]:
# This raises an error:
try:
sess_default.run(b)
except tf.errors.InvalidArgumentError as e:
print(e.message)
In [39]:
# Create feed dictionary
feed_dict = {a: 8}
# Now it works!
print(sess_default.run(b, feed_dict=feed_dict))
In [40]:
# Closing out the Sessions we opened up
sess_default.close()
sess_new.close()
In [43]:
my_var = tf.Variable(0, name="my_var")
However, even though the object has been created, the value of the Variable has to be initialized separately, with either the tf.initialize_variables() or, more commonly, the tf.initialize_all_variables() Operation. Remember that Operations must be passed into Session.run() to be executed:
In [45]:
sess = tf.Session()
sess.run(tf.initialize_all_variables())
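If you only want to initialize a subset of your Variables, tf.initialize_variables() takes an explicit list instead; a minimal sketch:
# Illustrative: initialize only `my_var`, rather than every Variable in the graph
sess.run(tf.initialize_variables([my_var]))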
Having value initialization separated from object creation allows us to reinitialize the variable later if we'd like.
Now that the Variable is initialized, we can tweak its value! Let's do some basic incrementing with the Variable.assign() method:
In [49]:
increment = my_var.assign(my_var + 1)
for i in range(10):
print(sess.run(increment))
You may notice that if you run the previous code multiple times in the notebook, the value persists and continues to climb. The Variable's state is maintained by the Session object, and that state will persist unless the session is closed, the Variable is re-initialized, or a new value is assigned to the Variable.
In [50]:
# Re-initialize variables
sess.run(tf.initialize_all_variables())
# Start incrementing, beginning from 0 again
for i in range(10):
print(sess.run(increment))
There are several optional parameters in the Variable constructor, but one to pay close attention to is trainable. It takes in a boolean value, which defaults to True, and specifies to TensorFlow whether the built-in optimization functions (which we will cover in a separate notebook) should affect this Variable. If a Variable in your model should not be adjusted during gradient descent, set its trainable parameter to False.
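For example (the name here is illustrative):
# A Variable that the built-in optimizers will leave untouched
step_counter = tf.Variable(0, trainable=False, name="step_counter")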
tf.get_variable()
Though the basic Variable() constructor is intuitive and good for beginners, eventually we would encourage you to move on to using the tf.get_variable() method for creating and accessing Variable objects. It allows users to more easily share Variables across complicated models, where handles to exact Variables can be lost or hard to manage. We will show some examples with tf.get_variable() in another notebook, but do check out the official how-to, as tf.get_variable() is best practice.
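As a tiny preview (a sketch only; see the how-to for the full pattern), tf.get_variable() is typically paired with tf.variable_scope():
# Sketch: create (or, with reuse enabled, re-fetch) a 2x2 Variable named "weights"
with tf.variable_scope("example_scope"):
    weights = tf.get_variable("weights", shape=[2, 2],
                              initializer=tf.random_normal_initializer())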
In [ ]:
sess.close()
In [ ]: